Emerging Cubes: Borders, size estimations and lossless reductions

Authors

  • Sébastien Nedjar
  • Alain Casali
  • Rosine Cicchetti
  • Lotfi Lakhal
Abstract

Discovering trend reversals between two data cubes provides users with novel and interesting knowledge when the real-world context fluctuates: What is new? Which trends appear or emerge? Which tendencies are immersing or disappearing? With the concept of the Emerging Cube, we capture such trend reversals by enforcing an emergence constraint. We review the classical borders for the Emerging Cube and introduce a new one which optimizes both storage space and computation time, and which provides a simple characterization of the size of Emerging Cubes as well as classification and cube navigation tools. We soundly state the connection between the classical and proposed borders by using cube transversals. Knowing the size of Emerging Cubes without computing them is of great interest, in particular for tuning the underlying emergence constraint. We address this issue by studying an upper bound and characterizing the exact size of Emerging Cubes. We propose two strategies for quickly estimating their size: one based on analytical estimation, without database access, and one based on probabilistic counting, using the proposed borders as the input of the near-optimal algorithm HYPERLOGLOG. Thanks to the efficiency of the estimation algorithm, several iterations can be performed to calibrate the emergence constraint. Moreover, we propose reduced and lossless representations of the Emerging Cube by using the concept of cube closure. Finally, we perform experiments on different data distributions in order to measure, on the one hand, the size of the introduced condensed and concise representations and, on the other hand, the performance (accuracy and computation time) of the proposed estimation method. © 2009 Elsevier B.V. All rights reserved.
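As an illustration of the probabilistic counting mentioned in the abstract, the following is a minimal, generic HyperLogLog sketch in Python. It is not the authors' implementation: the class name, the parameter `p`, the SHA-1 based hash, and the use of arbitrary Python tuples as input are all assumptions made for this example. The abstract only states that the proposed borders serve as the input of the algorithm, not how they are enumerated.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog cardinality estimator (illustrative sketch only)."""

    def __init__(self, p=12):
        # p index bits give m = 2**p registers; the standard error is
        # roughly 1.04 / sqrt(m), i.e. about 1.6% for p = 12.
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Deterministic 64-bit hash: first 8 bytes of SHA-1 over repr(item).
        # (Assumption: any well-mixed 64-bit hash would do here.)
        digest = hashlib.sha1(repr(item).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")
        idx = x >> (64 - self.p)                # top p bits select a register
        rest = x & ((1 << (64 - self.p)) - 1)   # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        # Harmonic-mean estimate over all registers.
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw
```

Under these assumptions, one would call `add` once per tuple and then read `estimate()`. Adding the same tuple again never changes the registers, so duplicates do not inflate the estimate, and memory use stays fixed at `2**p` small integers regardless of how many tuples are seen.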

Similar resources

Lossless Reduction of Datacubes using Partitions

Datacubes are especially useful for efficiently answering queries on data warehouses. Nevertheless, the amount of generated aggregated data is huge with respect to the initial data, which is itself very large. Recent research has addressed the issue of a summary of Datacubes in order to reduce their size. The approach presented in this paper fits in a similar trend. We propose a concise representa...

Automated Lossless Hyper-Minimization for Morphological Analyzers

This paper presents a fully automated lossless hyper-minimization method for finite-state morphological analyzers in the Xerox lexc formalism. The method utilizes flag diacritics to preserve the structure of the original lexc description in the finite-state analyzer, which results in a reduced size of the analyzer. We compare our method against an earlier solution by Drobac et al. (2014) which require...

Positive Borders or Negative Borders: How to Make Lossless Generator Based Representations Concise

A complete set of frequent itemsets can get undesirably large due to redundancy. Several representations have been proposed to eliminate the redundancy. Existing generator based representations rely on a negative border to make the representation lossless. However, negative borders of generators are often very large. The number of itemsets on a negative border sometimes even exceeds the total n...

Lossless Compression of Dynamic PET Data

We describe two approaches to lossless compression of dynamic PET data. In the first, a sequence of sinogram frames is compressed using differential encoding followed by lossless entropy-based compression. The second approach applies lossless compression to data stored in a sinogram/timogram format in which the arrival times of each photon pair are stored in spatial order, indexed by the sinog...

Perceptual audio coding using adaptive pre- and post-filters and lossless compression

This paper proposes a versatile perceptual audio coding method that achieves high compression ratios and is capable of low encoding/decoding delay. It accommodates a variety of source signals (including both music and speech) with different sampling rates. It is based on separating irrelevance and redundancy reductions into independent functional units. This contrasts traditional audio coding w...

Journal title:
  • Inf. Syst.

Volume 34  Issue 

Pages  -

Publication date 2009